AITopics

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.59)

Neural Information Processing Systems Feb-11-2026, 13:45:25 GMT

3d03800841fa1bb2f43ef1750aafcce4-Paper-Conference.pdf

large language model, machine learning, natural language, (21 more...)

Country:

Asia > Japan (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)
Asia > Vietnam (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

Neural Information Processing Systems Feb-11-2026, 02:02:25 GMT

f169b1a771215329737c91f70b5bf05c-AuthorFeedback.pdf

alignment, artificial intelligence, machine learning, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.72)

Jajal, Purvish, Eliopoulos, Nick John, Chou, Benjamin Shiue-Hal, Thiruvathukal, George K., Davis, James C., Lu, Yung-Hsiang

Inference-Time Alignment of Diffusion Models via Evolutionary Algorithms

arXiv.org Artificial Intelligence Nov-27-2025

Diffusion models are state-of-the-art generative models, yet their samples often fail to satisfy application objectives such as safety constraints or domain-specific validity. Existing alignment techniques require gradients, internal model access, or large computational budgets, which results in high compute demands or a lack of support for certain objectives. In response, we introduce an inference-time alignment framework based on evolutionary algorithms. We treat diffusion models as black boxes and search their latent space to maximize alignment objectives. Given equal or less running time, our method achieves 3-35% higher ImageReward scores than gradient-free and gradient-based methods. On the Open Image Preferences dataset, our method achieves competitive results across four popular alignment objectives. In terms of computational efficiency, we require 55% to 76% less GPU memory and are 72% to 80% faster than gradient-based methods.
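The black-box latent search described in this abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a generic elitist evolutionary loop with Gaussian mutation, and the names `evolve_latents`, `toy_generate`, and `toy_reward` are hypothetical stand-ins. A real setup would replace the toy generator with a diffusion sampler and the toy reward with an alignment scorer such as ImageReward.

```python
import random
from typing import Callable, List, Tuple

Latent = List[float]


def evolve_latents(
    generate: Callable[[Latent], object],
    reward: Callable[[object], float],
    dim: int,
    population: int = 16,
    elites: int = 4,
    generations: int = 20,
    sigma: float = 0.3,
    seed: int = 0,
) -> Tuple[Latent, float]:
    """Search latent vectors for high reward using only forward calls.

    The generator and the reward are treated as black boxes: no gradients
    and no access to model internals are needed.
    """
    rng = random.Random(seed)
    # Start from latents drawn from a standard normal prior.
    pop = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(population)]
    best: Latent = pop[0]
    best_score = float("-inf")
    for _ in range(generations):
        # Score each candidate by generating a sample and rating it.
        scored = sorted(
            ((reward(generate(z)), z) for z in pop),
            key=lambda pair: pair[0],
            reverse=True,
        )
        if scored[0][0] > best_score:
            best_score, best = scored[0]
        # Keep the top candidates and refill with mutated copies of them.
        parents = [z for _, z in scored[:elites]]
        children = [
            [x + rng.gauss(0.0, sigma) for x in rng.choice(parents)]
            for _ in range(population - elites)
        ]
        pop = parents + children
    return best, best_score


def toy_generate(z: Latent) -> Latent:
    """Hypothetical stand-in for a diffusion sampler: returns the latent."""
    return list(z)


def toy_reward(sample: Latent) -> float:
    """Hypothetical stand-in for an alignment score: peaks at all ones."""
    return -sum((x - 1.0) ** 2 for x in sample)


if __name__ == "__main__":
    latent, score = evolve_latents(toy_generate, toy_reward, dim=4)
    print(score)
```

Because the elites are carried over unchanged, the best score found never decreases from one generation to the next; the cost is one generator call and one reward call per candidate per generation.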

artificial intelligence, evolutionary algorithm, machine learning, (13 more...)

2506.00299

Country: North America > United States > Indiana > Tippecanoe County (0.14)

Genre:

Research Report (0.82)
Overview (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Piao, Shengmin, Park, Sanghyun

SpiralThinker: Latent Reasoning through an Iterative Process with Text-Latent Interleaving

arXiv.org Artificial Intelligence Nov-13-2025

Recent advances in large reasoning models have been driven by reinforcement learning and test-time scaling, accompanied by growing interest in latent rather than purely textual reasoning. However, existing latent reasoning methods lack mechanisms to ensure stable evolution of latent representations and a systematic way to interleave implicit and explicit reasoning. We introduce SpiralThinker, a unified framework that performs iterative updates over latent representations, enabling extended implicit reasoning without generating additional tokens. A progressive alignment objective combined with structured annotations maintains coherence between latent and textual reasoning. Across mathematical, logical, and commonsense reasoning tasks, SpiralThinker achieves the best overall performance among latent reasoning approaches, consistently surpassing previous methods across all benchmarks. Detailed analyses reveal that both iteration and alignment are indispensable, the numbers of latent tokens and iterations exhibit dataset-specific optima, and appropriate alignment proves critical for an effective iterative process. Overall, SpiralThinker bridges iterative computation and latent reasoning, demonstrating that aligned iterative updates can reliably steer reasoning in the latent space.

large language model, machine learning, natural language, (20 more...)

2511.08983

Country: North America > United States (0.93)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)

Neural Information Processing Systems Oct-9-2025, 23:54:35 GMT

MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models

Kailai Yang

Recent advancements in large language models (LLMs) focus on aligning to heterogeneous human expectations and values via multi-objective preference alignment.

alignment, metaaligner, objective, (15 more...)

Country:

Asia > Japan (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)
Asia > Vietnam (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence Sep-30-2025

Discrete Diffusion Trajectory Alignment via Stepwise Decomposition

Han, Jiaqi, Wang, Austin, Xu, Minkai, Chu, Wenda, Dang, Meihua, Yue, Yisong, Ermon, Stefano

Discrete diffusion models have demonstrated great promise in modeling various sequence data, ranging from human language to biological sequences. Inspired by the success of RL in language models, there is growing interest in further improving the models by alignment with a certain reward. In this work, we propose an offline preference optimization method to approach trajectory alignment for discrete diffusion models. Instead of applying the reward on the final output and backpropagating the gradient to the entire denoising process, we decompose the problem into a set of stepwise alignment objectives by matching the per-step posterior. This framework enables efficient diffusion optimization, is compatible with arbitrary reward functions, and importantly, yields an equivalent optimal solution under additive factorization of the trajectory reward. Experiments across multiple domains including DNA sequence design, protein inverse folding, and language modeling consistently demonstrate the superiority of our approach. Notably, it achieves up to a 12% improvement over the most competitive RL-based baseline in terms of predicted activity on DNA sequence design, and further improves the GSM8K score from 78.6 to 81.2 on LLaDA-8B-Instruct for language modeling.

diffusion model, machine learning, natural language, (19 more...)

2507.04832

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.87)

Neural Information Processing Systems Aug-17-2025, 06:00:52 GMT

their thoughtful suggestions, which we will incorporate into the final improved version of this submission

We would like to thank reviewers for their time and the effort they put into reviewing our submission. Such findings are better explained with simpler, controlled experiments. Q3: (R1) Local optima of LRMF wrt Th 2.3 (3), when does the equality hold? LRMF from converging to the actual value of LR-distance, i.e. the statement of the theorem always holds, but the final We will explicitly acknowledge this in the paper. "difficult to quantitatively reason about the performance of [GANs]" The choice of Real NVP vs GLOW is dictated by the same principle.

alignment, submission, thoughtful suggestion, (13 more...)

Genre: Research Report > Experimental Study (0.58)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.72)

arXiv.org Artificial Intelligence Aug-12-2025

Gradient Surgery for Safe LLM Fine-Tuning

Yi, Biao, Li, Jiahao, Zhang, Baolei, Nie, Lihai, Li, Tong, Huang, Tiansheng, Liu, Zheli

Fine-tuning-as-a-Service introduces a critical vulnerability where a few malicious examples mixed into the user's fine-tuning dataset can compromise the safety alignment of Large Language Models (LLMs). While a recognized paradigm frames safe fine-tuning as a multi-objective optimization problem balancing user task performance with safety alignment, we find existing solutions are critically sensitive to the harmful ratio, with defenses degrading sharply as the harmful ratio increases. We diagnose that this failure stems from conflicting gradients, where the user-task update directly undermines the safety objective. To resolve this, we propose SafeGrad, a novel method that employs gradient surgery. When a conflict is detected, SafeGrad nullifies the harmful component of the user-task gradient by projecting it onto the orthogonal plane of the alignment gradient, allowing the model to learn the user's task without sacrificing safety. To further enhance robustness and data efficiency, we employ a KL-divergence alignment loss that learns the rich, distributional safety profile of the well-aligned foundation model. Extensive experiments show that SafeGrad provides state-of-the-art defense across various LLMs and datasets, maintaining robust safety even at high harmful ratios without compromising task fidelity.
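The projection step mentioned in this abstract can be sketched on plain vectors. This is an illustrative sketch, not the SafeGrad implementation: the function name `project_conflicting` is hypothetical, and the conflict test (a negative inner product between the two gradients) is an assumption borrowed from common gradient-surgery practice, since the abstract does not state how a conflict is detected.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_conflicting(
    task_grad: Sequence[float],
    align_grad: Sequence[float],
    eps: float = 1e-12,
) -> List[float]:
    """Remove the part of task_grad that opposes align_grad.

    Assumption: a conflict means the inner product is negative. In that
    case the task gradient is projected onto the subspace orthogonal to
    the alignment gradient; otherwise it is returned unchanged.
    """
    inner = dot(task_grad, align_grad)
    norm_sq = dot(align_grad, align_grad)
    if inner >= 0.0 or norm_sq < eps:
        return list(task_grad)
    scale = inner / norm_sq
    return [t - scale * a for t, a in zip(task_grad, align_grad)]


if __name__ == "__main__":
    # The task gradient pushes against the alignment direction on axis 1.
    print(project_conflicting([1.0, -1.0], [0.0, 1.0]))  # [1.0, 0.0]
    # No conflict: the task gradient passes through untouched.
    print(project_conflicting([1.0, 1.0], [0.0, 1.0]))   # [1.0, 1.0]
```

After projection the task update has zero inner product with the alignment gradient, so to first order it no longer increases the alignment loss while still making progress along the remaining directions.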

large language model, machine learning, natural language, (18 more...)

2508.07172

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Neural Information Processing Systems May-26-2025, 22:03:15 GMT

MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models

Recent advancements in large language models (LLMs) focus on aligning to heterogeneous human expectations and values via multi-objective preference alignment. However, existing methods are dependent on the policy model parameters, which require high-cost repetition of their alignment algorithms for each new policy model, and they cannot expand to unseen objectives due to their static alignment objectives. In this work, we propose Meta-Objective Aligner (MetaAligner), the first policy-agnostic and generalizable method for multi-objective preference alignment. MetaAligner models multi-objective alignment into three stages: (1) a dynamic objectives reformulation algorithm reorganizes traditional alignment datasets to supervise the model on performing flexible alignment across different objectives; (2) a conditional weak-to-strong correction paradigm aligns the weak outputs of fixed policy models to approach strong outputs with higher preferences in the corresponding alignment objectives, enabling plug-and-play inference on any policy model, which significantly reduces training costs and facilitates alignment on closed-source policy models; (3) a generalizable inference method flexibly adjusts target objectives by updating their text descriptions in the prompts, facilitating generalizable alignment to unseen objectives. Experimental results show that MetaAligner achieves significant and balanced improvements in multi-objective alignments on 10 state-of-the-art policy models, and saves up to 93.63% of GPU training hours compared to previous alignment methods.

artificial intelligence, large language model, natural language, (8 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.61)